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1.  INTOODOCTION 

1.1  Overview 

This  report  describes  the  technical  and  software  procedures 
provided  by  Bedford  Research  Associates  for  the  Geomagnetism  Workshop  on 
the  topic  "Impulse  Response  of  the  Magnetosphere"  that  was  held  at  the  Air 
Force  Geophysics  Laboratory  (AFGL)  at  Hanscom  Air  Force  Base  from  December 

2,  1981  to  December  4,  1981. 

1.2  Technical  Background 

The  purpose  of  this  workshop  was  to  study  a  single  "SSC”  (storm 
sudden  commencement)  with  information  obtained  from  as  many  magnetometer 
networks  as  possible.  An  SSC  results  when  a  flare  or  other  explosive  event 
on  the  sun  generates  an  interplanetary  travelling  shock  wave  (a  blast) . 
This  shock  propagates  at  speeds  in  the  order  of  100  km/sec  and  arrives  in 
the  vicinity  of  earth  about  40  hours  later.  When  the  shock  strikes  the 
magnetopause, ■ the  result  is  a  sudden  conpression  of  the  magnetosphere  with 
an  attendant  sudden  jump  in  the  magnetic  field.  Such  a  jump  propagates, 
presumably  in  the  form  of  a  hydronegnetic  wave,  from  the  outer 
magnetosphere  to  the  ionosphere  where  it  sets  up  a  current  system.  The 
ionospheric  current  system  induces  a  sharp  charge  in  the  earth's  magnetic 
field  and  is  visible  almost  simultaneously  over  the  entire  earth  as  a  very 
sharp  haze  in  the  magnetogram.  With  modern  instrumentation,  using 
high-time  resolution  magnetometers,  the  SSC  can  be  observed  to  propagate 
over  the  ground.  In  a  coordinated  data  study,  the  world-wide  response  to 
such  an  event  can  be  studied  in  great  detail.  The  equivalent  ionospheric 
current  system  can  be  mapped  out,  the  nature  of  the  "SSC"  wave  can  be 
greatly  clarified,  and  indeed,  the  "impulse  responses"  of  the  magnetosphere 
can  be  studied. 


1.3  Programming  background 

Prom  a  programming  point  of  view,  this  project  involved  two  parts. 
Part  One  consisted  of  designing  a  data  base  suitable  for  this  data  set  and 
the  functional  procedures  to  be  performed,  than  inserting  the  data  into  the 
data  base. 

Part  Two  involved  design  and  coding  of  the  data  analysis  modiles 
to  be  used  during  the  confer  Mice. 

1.4  Outline  of  Report 

This  report  is  divided  into  seven  sections.  Section  1  serves  as 
an  introduction  and  provides  background  material  on  the  report.  Section  2 
describes  the  processing  of  the  raw  data  into  a  data  base  suitable  for  the 
initiator's  use  at  the  conference.  Section  3  contains  a  description  of  the 
data  analysis  modules  created  for  the  project,  while  Section  4  explains  the 
preliminary  tasks  performed  prior  to  the  conference.  The  conference, 
itself,  is  described  in  Section  5,  and  the  conclusions  resulting  from  the 
conference  are  presented  in  Section  6.  A  final  summary  is  given  in  Sec¬ 
tion  7. 


2.  DATA  PROCESSING 

2.1  Overview 

This  project  involved  the  formation  of  the  largest  on-line 
Geoiragnetic  data  base  ever  assembled  for  a  specific  conference.  The  size 
of  the  data  base  made  it  possible  to  formulate  more  general  conclusions 
since  a  larger  and  more  diverse  sample  set  was  available. 

2.2  Data  Description 

The  raw  data  tapes  were  sent  to  Bedford  Research  Associates  from 
various  participants.  These  tapes  had  a  multitude  of  formats  and 


specifications.  The  data  values  were  components  of  a  vector,  x,  y  and  z 
fro®  a  station  site  for  a  specified  tine  range. 


Ten  different  participants  front  around  the  globe  contributed  this 
data?  the  number  of  station  sites  totaled  95.  In  order  to  increase  the 
sample  set,  ten  events  were  selected  for  study.  Each  station  site 
contributed  data  for  at  least  one  event,  while  some  provided  data  for  all 
ten  events. 


2.3  Data  Base  Formation 

Because  the  data  base  was  to  be  used  on-line,  space  was  one  of  the 
major  considerations  in  forming  the  data  base.  Ten  separate  data  files 
were  formed,  since  the  ten  events  were  totally  unrelated  and  there  would  be 
no  need  to  draw  on  data  from  different  events  at  the  same  time. 

The  data  bases  were  divided  into  two  types  of  records:  1)  header 
records  containing  pertinent  information  about  the  data,  and  2)  data 
records  containing  just  packed  data  values.  Constructing  the  data  base  in 
this  manner  enables  one  to  search  for  information  simply  by  scanning  the 
header  record.  This  reduces  the  running  time  needed  for  data  analysis  and 
gives  faster  results  to  the  on-line  user.  Packing  the  data  values  in  a 
suitable  form  greatly  reduces  the  size  of  the  data  base. 

For  a  detailed  description  of  the  data  base  structure,  see 
APPENDIX  I. 


2.4  Problems  Encountered 

The  main  cause  of  problems  stemmed  from  the  diversity  of  data 
structures.  Each  of  the  ten  participants  supplied  data  in  a  different 
format,  e.g.  ASCII,  EBCIDIC,  character  format,  16  or  32  bit  binary,  one 
component  at  a  time,  or  3  components  at  a  time.  Some  participants  used 
different  formats  for  different  events. 
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Each  participant  was  asked  to  submit  a  sample  data  tape  in  advance 
so  that  Bedford  Research  might  design,  code  and  test  a  data  conversion 
program  for  putting  data  into  the  data  base.  Participants  were  also  asked 
to  said  listings  or  plots  of  the  data  for  verification  purposes.  Some  were 
unable  to  cooply  with  these  requests.  A  time  table  was  formulated  for 
preparation  of  the  conference  and  a  date  was  set  for  final  submission  of 
data  by  participants.  Some  data  arrived  late  but  was  still  processed  in 
time  for  inclusion  in  the  conference. 


3.  DATA  ANALYSIS 

3.1  Overview 

In  order  to  analyze  the  data  in  the  most  effective  way,  a  group  of 
statistical  and  data  display  procedures  were  selected.  The  ability  to 
produce  a  listing  of  any  data  segment  and  also  to  graphically  present  this 
data  were  desirable  features  for  direct  interpretation  of  the  data. 
Conference  participants  should  also  be  able  to  do  filtering,  power  spectra, 
cross  spectra  and  hodograms  of  the  data. 

The  first  step  involved  production  of  data  retrieval  routines  that 
would  be  time  efficient.  This  acconplished,  the  design  and  coding  of  the 
actual  analysis  modules  could  be  ccnpleted. 

3.2  Analysis  Programs 
3.2.1  Printing  the  Data 

The  display  of  the  data  values  in  printed  form  was  one  of  the  most 
straightforward  modules.  By  requesting  a  station  site,  date,  start  time, 
end  time  and  data,  type,  the  specific  data  segment  would  be  printed  out. 
The  available  data  types  consisted  of  component  x,  y,  z  or  the  magnitude  of 
any  components  taken  -  one,  two  or  three  at  a  time.  (See  Figure  3.1.) 
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3.2.2  Plotting  the  Data 

The  plotting  module  of  the  data  analysis  has  the  ability  to 
produce  standard  plots  or  user-customized  plots.  The  plots  can  be  produced 
either  on  interactive  graphics  or  on  microfiche.  The  standard  plots  are 
multiple  axis  plots  with  axis  scaling  to  fit  the  data,  three  plots  per 
frame  corresponding  to  the  different  components  for  an  entire  data  file. 
(See  Figure  3.2.) 

To  customize  the  plots,  several  options  are  available:  station 
site,  conponent  or  magnitude,  and  manual  selection  of  axis  scaling  and  base 
line.  The  final  option  is  production  of  single  axis  plots.  (See  Figure 
3.3.)  This  procedure  can  produce  from  1-10  plots  on  a  single  axis. 

3.2.3  Filtering  Data 

The  filtering  module  was  actually  three-fold.  The  first  part 
would  filter  the  data  to  change  the  sampling  rate.  Because  different 
participants  used  data  at  different  sampling  rates,  it  became  necessary  to 
design  and  construct  a  filter  module  that  would  create  a  uniform  delta-time 
data  base.  Also,  some  of  analysis  required  changing  the  sanpling  rate. 

The  second  part  was  the  construction  of  conmonly  used  filters:  low 
pass,  high  pass  and  band  width.  The  filters  would  be  used  on  a  data  set. 

The  third  part  was  making  a  filter  design  program  available  to 
participants  so  they  could  design  their  own  filters  for  use  on  a  data  set. 

3.2.4  Power  Spectra  Analysis 

A  power  spectra  analysis  module  that  would  take  the  power  spectra 
of  a  data  set  was  put  together.  It  was  designed  to  print  out  a  table  of 
spectral  characteristics  and  also  produce  custom  plots  with  variable  power 
and  frequency  axis  ranges.  (See  Figures  3.4  and  3.5.) 


3.2.5  Cross  Spectra  Analysis 


A  cross  spectra  analysis  module  with  the  ability  to  do  power 
spectral  estimates  for  up  to  3  station  sites  was  created.  The  module  would 
also  produce  polarization  parameters  and  cross  spectral  parameters  * 

3.2.6  Hodogram 
See  Figure  3.6 


3.3  System  Structure 

A  task-oriented  system  was  constructed  to  access  all  the  above 
modules.  The  system  would  perform  the  analysis  module  on  the  desired  data 
file,  access  all  the  necessary  system  and  data  files,  and  produce  the 
desired  output  files.  With  this  system,  the  user  could  do  an  entire 
analysis  nodule  by  issuing  only  one  key-word  command.  This  system  also 
allowed  the  user  to  do  a  stream  of  analysis,  i.e. ,  concatenate  analysis 
modules. 

System  modules  were  also  added.  One  of  these  modules  could 
construct  a  new  data  set  by  merging  two  old  data  sets.  Another  could 
change  any  information  in  the  header  record  in  a  data  set.  For  example, 
the  user  could  change  the  station  site  name  to  one  that  was  more 
descriptive.  Finally,  there  was  a  module  for  listing  out  the  station  site 
information  of  all  the  data  in  the  data  bank.  (Figure  3.7) 


4.  PREPARATION  FOR  THE  WORKSHOP 

4.1  Overview 

Bedford  Research  Associates  received  the  contract  for  this  work  at 
the  APGL  conference  in  January  1981. 


General  specifications  on  the  proposed  analysis  were  received  by 
the  end  of  January.  The  system  design  was  begun  at  that  time,  with  the 
interface  portions  omitted  until  the  data  base  structure  could  be 
determined. 

By  the  middle  of  February,  participants  began  sending  in  some 
sample  data  tapes.  With  these  as  a  guideline,  a  first  draft  of  the  data 
base  was  constructed. 

By  mid-March  the  final  data  base  structure  was  designed  and  a 
sample  data  base  constructed.  The  printing  and  plotting  analysis  modules 
were  completed  and  tested. 

Hie  months  of  March  till  the  end  of  October  were  spent  loading 
data  into  the  data  base  and  creating  the  other  analysis  modules. 

4.2  Test  Case  Analysis 

During  the  first  two  weeks  in  November,  a  final  test  case  of  the 
entire  system  was  run.  Every  conceivable  path  of  analysis  was  tested  for 
complete  accuracy.  Also,  every  possible  means  of  bringing  down  the  system 
or  "fooling"  it  was  tried  exit  in  order  to  troubleshoot  any  problems  that 
might  occur  during  the  conference. 


4.3  Analysis  Performed 

The  last  two  weeks  in  November  were  used  for  production  runs  that 
would  be  performed  prior  to  the  conference.  This  included  plotting  of  all 
the  data  in  the  data  base  and  forming  a  booklet  of  the  plots. 

For  analysis  purposes,  it  was  decided  to  have  all  the  data  at  a 
uniform  sampling  rate.  Therefore,  new  data  bases  were  created  from  the 
original  by  means  of  the  filtering  module. 

Standard  low  pass,  high  pass  and  band  width  filters  were  also 
constructed  for  the  conference. 


5.1  Overview 


The  workshop  ran  quite  smoothly,  with  language  being  the  only 
major  problem.  Since  participants  cane  from  around  the  world,  coanuni- 
cation  was  sometimes  difficult. 

5.2  Analysis  Performed 

Of  the  ten  events  available  to  participants,  four  were  determined 
to  be  of  special  interest  and  were  selected  for  the  conference.  All  the 
analysis  modules  used  were  highly  successful. 

6.  CONCLUSIONS 

6.1  Programing  Standpoint 

Prom  a  programming  standpoint,  the  conference  was  a  total  success. 
The  participants  were  quite  impressed  and  were  pleased  with  the  ability  to 
perform  a  number  of  analysis  techniques  on  such  a  diverse  data  base.  The 
system  ran  perfectly  and  efficiently  and  was  able  to  give  the  participants 
on-the-spot  investigative  tools  for  any  and  all  suppositions  brought  up  at 
the  conference. 

6.2  Technical  Standpoint 

From  information  gathered  at  the  conference,  participants  were 
able  to  formulate  outlines  for  a  number  of  technical  papers.  Because  of 
time  limitations  at  the  conference,  final  conclusions  could  not  be  drawn. 
However,  participants  will  be  interacting  with  the  data  base  via  mail  and 
plan  to  reconvene  in  May  to  cotrplete  the  writing  of  the  technical  papers. 

Some  of  the  participants  have  also  submitted  additional  data  that 
has  since  been  added  to  the  data  base  for  further  analytical  studies. 
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7.  SUWARY 


This  conference  was  highly  successful  from  all  points  of  view.  It 
gave  the  participants  an  opportunity  to  collectively  form  conclusions  about 
the  topic  "inpulse  Response  of  the  Magnetosphere"  and  to  test  out  these 
conclusions  by  on-line  access  to  a  large  data  base.  It  also  provided  the 
opportunity  to  gather  information  for  personal  as  well  as  group 
suppositions.  The  conference  served  as  a  starting  point  for  investigating 
suppositions  and  furnished  a  large  data  base  which  will  be  available  for 
further  investigation  after  the  conference. 


APPENDIX 


HEAnER  RECORD 

Consists  of  14  CDC  integer  words  defined  as  follows 

1  -  A10  station  name 

2  *  15  start  data  in  form  YYJJJ 

3  *  15  aid  data  in  form  YYJJJ 

4  =  16  start  time  (OT)  HHMMSS 

5  =  16  end  time  (UT)  HHJWSSS 

6  =  numerator  of  samples/second 

7  =  denominator  of  samples/ second 

8  =  shift  amount  for  X-coirponent 

9  =  shift  amount  for  Y  component 

10  =  shift  amount  for  Z -component 

11  =  scaling  factor  for  X-conponent 

12  =  scaling  factor  for  Y-conponent 

13  *  scaling  factor  for  Z-conponent 

14  =  default  value  for  this  station 

DATA  RECORD 

Consists  of  384  CDC  words  packed  five  values/word  in  the  following 
format: 


word  12  3 


Ill  CONVERT  FROM  REAL  VALUES  TO  UNBACKED  DATA  VALUES 


a)  Final  MID  Value  *»  (max  +  min)/2 

b)  Determine  range  of  values 

RANGE  =  max  -  min. 

c)  By  using  a  scale  factor,  adjust  range  to  be  less  than  4,000. 

d)  Do  a,  b,  c  above  for  all  3  components,  then  form  HEADER  RECORD 
where 

HEAD  (11)  =  -  Scale  Factor  for  X 
HEAD  (12)  =  -  Scale  Factor  for  Y 
HEAD  (13)  =  -  Scale  Factor  for  z 

If  scaling  operation  to  adjust  the  range  is  division,  then 

HEAD  (11)  =  Scale  Factor  for  X 
HEAD  (12)  =  Scale  Factor  for  Y 
HEAD  (13)  =  Scale  Factor  for  Z 

e)  Buffer  out  Header  (1)  -  Header  (14)  to  tape  2. 

f)  Form  integer  data  array  of  adjusted  values  by 

adjusted  =  [original  -  MID  Value] 

in  the  multiplication  scale 
divisional  scale 

in  the  form  of  triplets  for  each  Dt. 

DATA  (1)  =  X  (t .  =  start  +  iDt) 

DATA  (2)  =  Y  h  t 
DATA  (3)  =  Z 
DATA  (4)  =  X 
DATA  (5)  =  Y  k  t. 

DATA  (6)  =  Z 

g)  Where  there  are  1920  data  values  in  the  array  (i.e.,  640  values  of 
X,Y,Z  respectively),  call  subroutine  FILL. 

The  calling  sequence  is 

Call  Fill  (DATA, NUM, TOTAL, HEADER) 

DATA  =  1920  or  less  integer  values. 

NUM  =  number  of  data  values  in  array  data  (Integer) 

ITOTAL  =  returned  value  of  number  of  data  records  buffered 
out.  ITOTAL  mast  be  originally  set  to  zero  and  is 
incremented  by  1  for  each  call  to  fill. 

HEADER  =  default  value 
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3.6 


MINPTEMftN  DKTA  ANALYSIS 


1.  INTRODUCTION 


The  Minuteman  database  consists  of  recordings  taken  from  20 
different  sites.  These  sites  sure  labelled  102-111  and  R02-111.  The  time 
span  for  all  sites  where  data  was  recorded  is  December  1,  1979  to  March 
31,  1980. 


Each  site  has  ten  readings  or  variables.  These  variables  consist 
of  X,  Y  and  z  platform,  instrument,  and  gyro,  plus  a  phi-N  measurement. 
Each  variable  has  a  sampling  rate  of  seven  hours  and  fifteen  minutes  (+  30 
minutes) .  The  sampling  times  for  all  ten  variables  of  one  site  axe  the 
same,  but  each  site  has  different  sampling  times.  Consequently,  each  site 
has  gaps  in  the  data  unique  to  that  particular  site. 


2.  OBJECTIVES 

The  current  analysis  of  the  Minuteman  database  was  directed  toward 
locating  consistent  structures  in  the  data.  All  analysis  was  focused  on 
relationships  between  variables  of  the  same  site  that  were  exhibited  in  a 
large  proportion  of  the  twenty  sites.  Direct  relationships  between 
variables  of  different  sites  were  not  studied.  For  example,  a  coherence 
function  using  X -platform  data  from  site  102  and  103  as  input  was  not 
performed. 

There  was  no  effort  to  detect  or  analyze  events  in  the  data.  An 
event  is  defined  as  one  or  more  large  impulses  or  oscillations  relative  to 
the  majority  of  data  values  in  a  continuous  "stream"  of  data.  (Figure  1A 
depicts  an  "event"  occurring  from  days  73  to  80) . 
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It  was  desired  to  eliminate  any  redundant  variables  by  observing 
consistently  high  correlation  values  among  variables  in  a  site.  A  multiple 
step-wise  regression  holding  phi-N  dependent  will  be  implemented  to  attain 
this  goal.  Power  spectra  and  coherence  functions  attained  by  using  an  FFT 
algorithm  should  detect  any  strong  frequencies  relating  to  the  variables. 


3.  ANALYSIS 

Hie  first  effort  consisted  of  plotting  the  ten  variables  of  site 
102.  This  site  was  chosen  at  random  to  get  some  idea  of  the  data  struc¬ 
ture.  Hie  plots  revealed  that  each  variable  consists  of  roughly  evenly- 
spaced  stretches  (i.e. ,  no  gaps)  of  data  with  an  average  of  fifteen  points 
or  a  spanning  of  approximately  five  days.  Hiese  "stretches"  were  inter - 
rupted  by  gaps  of  approximately  one  day,  but  occasionally  as  long  as  twenty 
days.  Also,  each  stretch  of  evenly-spaced  samples  appeared  to  have  its  own 
bias  associated  with  it.  A  check  of  the  output  listing  on  all  twenty  sites 
confirmed  this  trait. 

A  multiple  regression  algorithm  was  implemented  at  this  time.  Hie 
nine  variables  (X,  Y,  and  Z  platform,  instrument  and  gyro)  were  the  inde¬ 
pendents,  while  phi-N  was  held  as  the  dependent  variable.  Each  evenly- 
spaced  stretch  of  observations  for  all  sites  and  variables  had  its  associ¬ 
ated  bias  removed  from  it.  Hie  results  indicated  an  extremely  strong 
correlation  between  X-platform  and  the  X,  Y,  and  Z  instrument  measurements. 
When  the  variables  were  not  forced  into  the  regression  model,  one  of  the 
four  variables  mentioned  above  was  usually  entered  into  the  model.  Hie 
squared  multiple  correlation  coefficient  (R2)  was  usually  .05  for  the 
twenty  sites  with  a  neximum  of  .15. 

Hie  regression  package  was  applied  to  all  twenty  sites  forcing  the 
nine  variables  into  the  model.  Hie  results  were  not  significantly  differ¬ 
ent  from  those  with  no  forcing. 
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The  tables  depicting  the  regression  results  were  drawn.  Table  1A 
illustrates  which  variables  were  altered  into  the  unforced  step-wise  model 
and  the  order  in  which  they  were  inserted  for  each  site.  Also,  the  squared 
multiple  correlation  coefficient  for  each  site  is  listed.  Table  IB  has  a 
similar  setup  (sites  versus  variables) ,  except  that  it  indicates  pairs  of 
variables  whose  cannon  correlation  coefficient  is  greater  than  0.4.  Table 
1C  contains  counts  of  two  variables  with  a  correlation  coefficient  larger 
than  0.4  out  of  the  20  sites. 

Upon  inspection  of  the  time  gaps  of  all  sites,  it  was  observed 
that  a  large  gap  exists  between  December  19  and  January  15  for  all  sites. 
All  subsequent  analysis  was  performed  only  on  data  occurring  after  this  gap 
since  no  similar  gaps  exist  from  January  15  to  March  31.  The  data  base  was 
further  reduced  by  eliminating  the  three  instrument  variables.  This  was 
done  because  of  the  large  and  consistent  correlation  of  these  variables  to 
the  X-platform. 

TVo  approaches  were  decided  upon  at  this  time:  first,  regression 
analysis  on  the  reduced  database  with  forcing  only  on  the  X-platform,  and 
second,  spectra  analysis  and  coherence  functions  on  the  X-platform  and  the 
phi-N  variable.  These  two  variables  are  linked  by  stronger  correlation 
values  between  the  X-platform  and  instrument  readings  to  the  phi-N  measure¬ 
ments  relative  to  the  reraining  variables. 

The  results  of  the  regression  analysis  on  the  reduced  database  is 
depicted  in  Tables  2A,  2B,  and  2C.  Their  structure  is  similar  to  Tables 
1A,  IB,  and  1C,  respectively.  The  results  revealed  no  consistent  correla¬ 
tions  between  any  of  the  seven  remaining  variables.  No  increase  in  the 
squared  correlation  coefficient  was  observed. 

The  reason  for  the  spectra  analysis  was  to  check  whether  the  sig¬ 
nals  have  low  frequency  peaks.  If  this  is  the  case,  then  a  low-pass  filter 
can  be  implemented  to  reduce  higher  frequency  noise.  Regression  can  be 
performed  on  the  filtered  data.  Hopefully,  this  will  yield  better  correla¬ 
tion  and  R2  results. 


Before  spectra  analysis  (i.e. ,  FFT)  can  be  applied  to  the  data, 
the  signal  samples  must  be  evenly  spaced  in  time.  The  Minutenan  data  must 
have  all  gaps  interpolated  before  an  FFT  can  be  applied.  A  linear  predic¬ 
tion  routine  was  used  to  interpolate  for  missing  data.  Specifically, 
forward  and  backward  prediction  was  used  with  weights  determined  by  the 
estimated  variance  associated  with  the  measured  data  points. 

The  gaps  in  all  X-platform  and  phi-N  signals  were  filled.  The 
coherence  and  power  spectrum  plots  were  then  obtained.  The  results  of  the 
plots  displayed  only  one  prominent  characteristic.  A  large  "spike"  was 
evident  in  the  spectra  of  the  X-platform  at  a  frequency  of  one  cycle  per 
21.75  hours. 

The  spike  was  observed  in  all  X-platform  plots  but  was  not  con¬ 
sistent  in  either  the  coherence  plots  or  the  phi-N  plots.  This  feature  was 
consistent  with  a  parameter  associated  with  sampling  times.  Each  time  data 
was  recorded  a  "direction"  (east  or  west)  associated  with  that  observation. 
Every  third  observation  was  associated  with  an  east  orientation.  Upon 
close  inspection  of  the  data,  especially  the  X  and  Y  platforms,  it  was 
noted  that  different  bias  levels  existed  for  each  direction.  Since  the 
sampling  rate  was  7.25  hours  and  the  bias  occurred  in  cycles  of  three 
observations,  a  frequency  of  one  cycle  per  21.75  hours  should  have  been  and 
indeed  was  prominent  in  the  spectra. 

An  algorithm  was  designed  to  remove  this  east-west  bias.  The  data 
for  each  stretch  of  evenly-spaced  data  was  grouped  by  its  direction,  and 
the  mean  for  each  group  was  calculated  and  subsequently  removed.  This  was 
performed  on  all  variables  for  all  sites  and  regression  analysis  was 
reapplied.  The  results  showec  some  improvement  in  the  correlation  values 
between  the  X  and  Y  platform  and  the  dependent  variable  phi-N.  No  improve¬ 
ment  was  observed  in  the  R2  values.  Power  spectra  and  coherence  plots  on 
this  database  illustrated  the  attenuation  of  the  frequency  component  due  to 
the  east-west  biasing,  but  no  other  frequencies  were  significantly  stronger 
on  a  consistent  basis. 


2 


fi 


Since  the  coherence  function  has  a  range  of  0  to  +1,  all  coherence 
plots  could  be  averaged  together.  Coherence  averaging  plots  were  generated 
for  cases  with  bias  present  as  well  as  removed.  The  results  from  both  plots 
revealed  no  significant  frequencies  common  to  the  X-platform  and  phi-N 
measurements . 

4.  CONCLUSIONS 

Thirty  percent  of  the  database  can  be  reduced  by  eliminating  the 
three  instrument  variables.  This  is  true  because  of  the  consistent  high 
correlation  between  the  X-platform  variable  and  the  three  instruments.  The 
database  can  be  further  reduced  by  ignoring  samples  prior  to  January  15. 
Any  subsequent  analysis  should  take  into  account  these  features. 

No  significant  relationships  were  exhibited  between  the  remaining 
independent  variables  and  phi-N.  It  should  be  noted  that  only  structures 
corrmon  to  all  twenty  sites  were  sought. 

The  power  spectra  indicated  spikes  randomly  dispersed  through  the 
frequency  spectrum.  This  ruled  out  any  digital  filtering  of  the  data  to 
improve  correlation  values;  real  spikes  could  not  be  distinguished  from 
noise.  The  spectra  indicated  different  bias  levels  only  in  association 
with  the  east-west  orientations. 

It  should  be  repeated  that  all  studies  were  made  on  relationships 
between  variables  of  the  same  site  that  could  be  validly  extended  to  all 
sites.  There  was  no  detection  or  analysis  of  events  occurring  in  vari¬ 
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1.  INTRODUCTION 

An  attempt  is  made  to  develop  regression  terms  and  quantify  the 
effects  of  mountain-induced  disturbances  in  the  lower  atmosphere.  This 
information  will  be  used  to  improve  an  existing  model  for  predicting  the 
occurrences  of  turbulences  in  the  lower  atmosphere.  Murphy  and  Scharr 
(1981)  and  Murphy  et  al  (1982)  contain  further  explanations  of  the  tech¬ 
nique  for  determining  turbulence  estimates  and  the  present  model  for  pre¬ 
dicting  occurrences. 

The  gradient  Richardson  number,  used  as  the  dependent  variable  in 
the  regression  analysis,  is: 

(g/p)  (dT/dz  +  rd  ] 

Ri  =  _ 

(  <?V  a z)2  +  (  «?vy/  az)2 

where  g  is  the  acceleration  of  gravity,  T  is  the  absolute  temperature, 
tfr/dz  is  the  vertical  temperature  gradient,  rd  is  the  dry  adiabatic  lapse 
rate  (9.7°  K/Km)  and  the  denominator  is  the  square  of  the  vertical  shear  of 
the  horizontal  wind.  Rawinsonde  measurements  of  wind  and  temperature  as  a 
function  of  altitude  are  used  to  determine  Richardson  numbers.  The 
Richardson  number  values  as  a  function  of  altitude  are  calculated  at  1  km 
levels  from  2  km  to  30  km  for  each  of  two  daily  height  profiles  of  wind  and 
temperature.  This  provides  from  100  to  approximately  180  values  of  Ri  for 
each  altitude  bin  per  season.  The  percent  occurrence  of  turbulence  is  ob¬ 
tained  by  taking  the  ratio  of  the  number  of  occurrences  of  Ri  ^  1.0  to  the 
total  number  of  measurements  per  season.  The  likelihood  of  occurrence  of 
turbulence  is  then  based  on  the  relative  frequency  of  occurrence  of  a 
critical  Richardsori  number  (Ric  =  1.0). 
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The  present  model  for  predicting  turbulence  has  been  established  for 
the  following  ranges:  lower  range  (2-7  km),  middle  range  (8-13  km)  and 
upper  range  (14-19km).  Model  variables  include  location  type  (latitude, 
longitude,  altitude),  transformations  of  them  (lat*,  long^,  altk) ; 
i,j=0,l,2?  k=0,l,2,3),  cross  products  terms  (lat*,  long^,altk)  and  seasonal 
indicators.  The  dependent  variable,  an  indicator  of  turbulence,  is  based 
on  percent  occurrences  of  the  Richardson  number  less  than  unity.  By  using 
location  and  seasonal  information  we  have  been  able  to  explain  a  substan¬ 
tial  amount  of  the  variation  in  the  turbulence  indicator  (38%  in  the  lower 
range,  61%  in  the  middle  range,  and  42%  in  the  upper  range) .  This  has  been 
done  for  a  sample  21  stations  chosen  as  representative  of  the  continental 
United  States.  Also,  another  important  result  of  that  study  is  that  yearly 
variations  in  the  occurrences  of  Ric  are  found  to  be  small  (less  than  1%  of 
the  total  variation).  Evidence  of  mountain-induced  disturbances  are  pre¬ 
sented  and  improvements  are  made  in  the  model  through  the  use  of  terms  to 
quantify  terrain  aspects. 


2.  EVIDENCE  OF  MOUNTAIN-INDUCED  DISTURBANCES  -  A  COMPARISON  OF  DATA  SETS 

A  qualitative  analysis  performed  on  many  of  the  data  reporting 
stations  (81  in  the  continental  United  States)  revealed  that  the  height 
profiles  of  percent  occurrences  of  Ri  i  1.0  for  stations  in  the  proximity 
of  mountain  ranges  are  markedly  different  from  those  stations  in  the  plains 
areas  of  the  United  States.  This  was  noted  from  visual  observation  of 
microfiche  computer  plots. 

Three  sets  of  data  are  used  to  describe  mountain-induced  distur¬ 
bances  in  the  atmosphere.  Each  set  consists  of  3  stations  at  nearly  the 
same  latitude.  In  each  set,  two  measuring  stations  are  in  the  midwestern 
plains  and  one  is  in  the  western  mountain  region.  The  one  exception  to 
this  is  Chatham,  Massachusetts,  which  is  here  considered  a  plains  station. 
Only  nine  stations  from  the  81  stations  in  the  continental  United  States 
are  suitably  located  to  provide  this  kind  of  comparison.  In  the  first  set. 
Figure  1,  the  three  stations  used  for  comparison  are  within  approximately  2 
degrees  of  latitude.  The  two  plains  stations,  Sault  St.  Marie,  Michigan 
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and  International  Falls,  Minnesota  are  plotted  together  in  Figure  la.  The 
degree  of  closeness  of  fit  for  the  two  profiles  of  percent  occurrence  of 
Rii  1  are  evident.  This  is  true  for  all  four  seasons.  Note  that  the  pro¬ 
files  for  each  season  are  for  the  five-year  averages  (1971-1975) .  Each  of 
the  years  compared  separately  is  also  close.  This  is  true  since,  as  was 
previously  stated,  the  yearly  contribution  to  variation  in  percent  occur¬ 
rence  of  turbulence  (using  Ri  as  an  indicator)  was  found  by  regression  to 
be  less  than  one  percent.  The  higher  percent  occurrences  for  the  mountain 
site,  Great  Falls,  Montana,  shown  in  Figure  lb,  continues  up  to  approxi¬ 
mately  15  km.  In  fact,  percent  occurrences  cure  consistenly  higher  in  all 
of  the  five  years  from  1971-1975. 

The  two  plains  stations  of  the  second  set,  Topeka,  Kansas  and 
Dayton,  Ohio,  are  plotted  together  in  Figure  2a.  The  mountain  station  for 
this  set  is  Denver,  Colorado,  which  is  plotted  for  comparison  along  with 
Topeka,  Kansas  in  Figure  2b.  Here  again  there  is  fairly  good  agreement 
between  the  two  plains  stations,  whereas  there  are  differences  as  high  as 
25%  between  the  plains  and  mountain  stations.  These  differences  are  found 
as  high  as  8  km.  The  three  stations  in  this  set  are  located  within  a 
degree  of  latitude. 

In  the  third  set,  the  three  stations  are  separated  by  just  over 
one  degree  in  latitude.  The  plains  stations,  Chatham,  Massachusetts  and 
Flint,  Michigan,  are  plotted  together  in  Figure  3a  and  are  remarkably 
close.  In  comparison,  the  percent  occurrences  of  the  mountain  station. 
North  Platte,  Nebraska  (Figure  3b),  are  seen  to  be  markedly  higher  than  the 
plains  station,  Flint,  Michigan.  This  is  true  up  to  a  height  of  5  km. 


3.  TERRAIN  EFFECTS  ANALYSIS  -  STATISTICAL  APPROACH 

Appropriate  statistical  techniques  were  employed  to  determine 
significance  and  to  develop  the  improved  model.  Using  the  21  stations  that 
determined  the  original  model,  the  following  points  are  considered  in  an 
effort  to  quantify  the  terrain  effects: 


(a)  Classification  of  terrain  features 

(b)  Creation  of  terrain  type  variables 

(c)  Determination  of  effects 

(a)  Classification 

It  is  apparent  from  a  topographic  map  of  the  United  States  that 
there  are  three  types  of  terrain  -  the  coasted  sea  level  areas,  the  central 
plains  section  and  the  western  mountain  region,  initial  classification  was 
done  by  grouping  stations  within  the  eastern,  central,  and  western  section. 
To  determine  if  terrain  effects  were  significant,  a  two-way  analysis  of 
variance  (ANCMA)  test  was  performed  for  each  altitude  bin  (Draper  and 
smith,  1981) .  The  range  of  the  mean  differences  in  turbulence  within  each 
altitude  bin  for  the  different  terrain  groups  was  of  interest.  This  sta¬ 
tistical  test  considers  mean  differences  between  groups  using  the  test 
statistic  F.  Results  of  the  two-way  ANOVA  on  the  2-7  km  and  8-13  km  bins 
are  presented  (Table  1).  Examination  of  the  F-statistic  reveals  that  there 
are  overall  significant  differences  (p«.001)  between  the  defined  terrain 
groups  (Table  2). 


(b)  Creation  of  terrain  variables 

In  order  to  include  the  effects  of  terrain  in  the  model,  two  vari¬ 
ables  were  created  to  indicate  the  type  of  terrain: 


II  if  station  is  in  sea  level  category 
0  otherwise 


mountain 


1  if  station  is  in  mountain  category 
0  otherwise 


A  plains  station  would  then  be  indicated  by  sea  level  =  0  and 
mountain  =0.  A  multiple  stepwise  least  squares  regression  procedure  was 
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employed  to  determine  the  significance  of  terrain  differences  on  turbu¬ 
lence.  The  set  of  independent  variables  candidates  for  entry  into  the 
turbulence  model  include  the  previous  terms  from  the  existing  model  plus 
sea  level  and  mountain  terms.  Based  on  the  AN3VA.  results,  the  regions  2-7 
km  and  8-13  km  were  selected  for  modeling. 

(c)  Determination  of  effects 

In  the  2-7  km  range,  the  multiple  correlation  coefficient  squared 

2 

(R  )  increased  by  nearly  10%  over  the  model  without  terrain  indicators. 
Thus,  a  significant  improvement  in  the  turbulence  model  is  represented. 

There  was  a  negligible  contribution  to  turbulence  explained  in  the 
8-13  km  model  using  the  terrain  indicator  variables.  This  was  true  even 
though  the  ANDVA  test  in  that  range  determined  significant  differences  be¬ 
tween  terrain  groups.  The  explanation  for  this  is  that  turbulence  varia¬ 
tions  due  to  differing  terrain  types  occur  at  varying  altitudes  in  the  8-13 
km  range,  thus  accounting  for  the  ANCJVA  result.  Because  there  is  no  sus¬ 
tained  effect  at  any  given  altitude,  the  terrain  variables  do  not  add  sig¬ 
nificant  information  to  explain  turbulence  in  the  8-13  km  bins.  Occurrences 
of  turbulence  estimates,  based  on  a  critical  Richardson  nunber  of  1.0,  are 
shown  in  comparison  to  model  predicted  values  in  Figure  4.  These  results 
from  three  representative  stations,  two  in  the  mountain  and  one  in  the 
plains  region,  clearly  demonstrate  the  improvements  in  the  model  with  the 
added  terrain  terms. 


4.  SUMMARY 

a.  An  independent  variable,  quantifying  terrain  characteristics  in 
the  general  region  of  station  location,  improves  turbulence  estimates  in 
the  2-7  km  altitude  range.  The  accuracy  of  the  estimates  is  increased  by 
10%  over  the  model  without  terrain  indicators. 


b.  Turbulence  estimate  above  8  km  altitude  are  not  sicpificantly 
inproved  by  knowing  the  type  of  terrain. 

c.  There  is  evidence  to  suggest  that  a  more  localized  effect  is 
present.  The  average  heights  of  the  mountains  within  a  radius  of  several 
hundreds  of  km  of  a  station  appear,  from  a  qualitative  inspection,  to  be 
related  to  the  degree  of  activity  in  the  2-7  km  altitude  region.  This, 
however,  is  difficult  to  quantify  and  further  efforts  will  be  applied  in 
this  direction. 


taste  1 


TWO-WAY  ANALYSIS  OF  VARIANCE  RESULTS 


2-7.  .fan 

Winter 

Spring 

Sumner 

fall 

terrain 

2,108 

10.91 

(7.41)* 

4.99 

(7.41) 

28.28 

(7.41) 

24.54 

(7.41) 

alt.  level 

5,108 

4.78 

(4.48) 

9.87 

(4.48) 

5.50 

(4.48) 

4.77 

(4.48) 

8-13 -fan 

terrain 

2,108 

30.42 

(7.41) 

36.78 

(7.41) 

6.72 

(7.41) 

15.38 

(7.41) 

alt.  level 

5,108 

28.17 

(4.48) 

28.17 

(4.48) 

28.27 

(4.48) 

58.38 

(4.48) 

*  Calculated  F-value  (Critical  F-value  for  p  =  .001) 

A  significant  difference  at  the  .001  level  occurs  when 
Fcritical  <  Fcalculated* 
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1.  OVERVIEW  OF  THE  SOFTWARE  PACKAGE 

The  software  package  developed  by  Bedford  Research  Associates 
processes  irregularly-spaced  satellite  measurements  of  ion  density  for  the 
Air  Force  Global  Weather  Center  (APGWC)  and  extrapolates  them  to  a  regular¬ 
ly-spaced  grid.  The  extrapolation  procedure  uses  an  estimation  algorithm 
called  the  Best  Linear  Unbiased  Estimator  (BLUE) .  The  package  is  complete¬ 
ly  self-contained,  needs  no  outside  algorithms,  and  needs  no  input  parame¬ 
ters  except  the  satellite  data.  A  short  overview  of  the  structure  of  the 
software  package  follows.  Details  about  the  theory  of  BLUE  and  the  prac¬ 
tice  of  parameter  estimation  are  found  in  subsequent  sections. 

Figure  1-1  shows  the  overall  structure  of  the  software,  which  can  be 
divided  into  three  basic  modules  as  shown.  The  first  module  performs  the 
actual  reading  and  editing  of  the  data  file.  The  input  to  this  module  is  a 
24-hour  set  of  GWC  satellite  data  containing  values  of  longitude,  latitude 
and  density.  The  data  are  subdivided  into  6  AM  and  6  PM  data  files,  ac¬ 
cording  to  the  local  times  along  the  satellite  path  at  which  the  measure¬ 
ments  are  made.  AM  data  occur  when  the  satellite  is  travelling  south  to 
north,  while  PM  data  are  acquired  north  to  south.  The  grid  to  which  the 
data  is  extrapolated  covers  0°-80°N  latitude  with  a  delta  of  15°.  Since 
the  range  of  the  GVC  data  is  approximately  82°S-82°N,  and  0°-360°  longi¬ 
tude,  all  data  not  in  the  grid  range  are  eliminated.  Finally,  the  data 
files  are  further  reduced  by  sampling  every  fourth  observation  of  each 
file.  This  sampling  is  required  to  meet  constraints  imposed  by  memory  and 
computer  time  considerations .  The  maximum  number  of  points  which  can  cor¬ 
rectly  be  accepted  by  BLUE  is  100;  the  sampling  just  described  results  in 
approximately  ninety  points  per  file. 
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Input  GHC  Satellite  Data, 

24  hrs  containing  longitude, 
latitude  and  density  values. 

* 

Breakdown  data  into  AM  and  PM 
files,  north  to  south  orbit  is 

PM  and  south  to  north  is  AM. 

Data  in  the  latitude  range 

0-80°  N  is  kept. 

*  .  ... 

Each  file  contains  ~ 350pts. 

Reduce  by  sampling  every  4th 
point  to  get  a  file  with  less 
than  100  pts.,  the  maximum  num¬ 
ber  BLUE  can  take  in. 

Fit  a  Cubic  Least  Squares  approx¬ 
imate  to  each  file  with  respect  to 
latitude  and  retain  the  coefficients. 
- * - 


Feed  the  2  files  and  their  respective 
cubic  fit  coefficients  into  the  BLUE 
algorithm.  Two  methods  can  be  imple¬ 
mented  at  this  point  —  Covariance  or 
Semi-variogram. 


Upon  output  from  BLUE  we  have  2  data 
files,  AM  and  PM,  with  estimates  at  the 
grid  locations  defined,  0°-360o  longi¬ 
tude  every  15°  and  0°-80°N  latitude 
every  10  .  


Extend  the  AM  and  PM  grids  to  24  LT 
grids,  one  grid  per  LT  hour,  by  a 
weighted  combination  of  the  2  grids. 
The  weights  are  derived  by  a  time 
function  supplied  by  the  initiators. 

♦ 

« _ 

Re-index  24  LT  grids  into  24  ITT  grids 
to  be  caipatible  with  existing  soft¬ 
ware  and  output  the  grids  to  disk. 

System 

Disk 

bO 


FIGURE  1.1:  Software  Package:  Basic  Structure 
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The  second  module  executes  the  actual  BLUE  algorithm  to  extrapolate 
the  reduced  satellite  readings  to  the  6AM  and  6PM  grids.  Two  versions  of 
the  second  module  have  been  developed,  employing  slightly  different  model¬ 
ling  assumptions.  The  first  version,  which  is  currently  operational  at 
APGWC,  assumes  that  a  trend  or  mean  structure  in  the  ion  density  field  can 
be  subtracted  from  the  observations  prior  to  performing  the  extrapolation. 
The  trend  is  restored  to  the  estimated  grid  values  at  the  end  of  the  BLUE 
algorithm.  The  second  version,  currently  under  development,  applies  the 
BLUE  algorithm  to  the  actual  observations  without  prior  subtraction  of  the 
trend,  but  adds  a  constraint  that  ensures  consistency  between  the  mean  of 
the  observed  values  and  the  mean  of  the  estimated  grid  values.  In  addi¬ 
tion,  the  first  version  employs  the  covariance  function  to  model  the 
spatial  correlation  of  the  density  field,  while  in  the  second  version  the 
semi-variogram  function  is  being  tested  as  a  possible  correlation  model. 
The  theory  and  modelling  techniques  used  in  the  first  version  are  presented 
in  Section  2,  while  Section  3  discusses  the  theory  and  modelling  being 
developed  under  the  second  version.  In  both  versions,  the  final  output  of 
the  second  module  is  a  uniform  grid  of  unbiased,  optimal  estimates  of  ion 
density  for  each  set  of  6AM  and  6PM  observations. 

In  the  last  module,  the  6AM  and  6PM  grid  estimates  are  extended  to  a 
24-hour  period,  providing  estimated  grids  at  one-hour  increments  of  local 
time.  A  function  defining  the  variation  of  density  values  with  local  time, 
shown  in  Figure  1-2,  is  used  to  derive  a  set  of  time-dependent  weights. 
These  weights  are  used  to  extend  the  6AM  and  6PM  files  to  24-hour  local 
time  files.  The  time  function  currently  used  was  arrived  at  through  dis¬ 
cussion  with  the  initiators.  All  points  on  the  grid  are  assumed  to  vary 
equally  with  time.  This  relatively  crude  approximation  reflects  currently 
available  knowledge  and  data  on  time  variation,  and  refinement  is  planned 
as  more  information  becomes  available. 
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2. 


MODEL  ASSUMPTIONS  AND  THEORY:  BLUE  VERSION  1 


As  noted  in  Section  1,  the  version  of  the  software  currently  opera¬ 
tional  at  AFGWC  varies  somewhat  from  the  routines  now  under  development  for 
subsequent  implementation  at  AFGWC.  The  difference  lies  in  the  second  mod¬ 
ule  which  performs  the  actual  estimation  of  the  density  field  at  grid  loca¬ 
tions.  In  both  versions,  the  trend  or  mean  structure  of  the  density  field 
is  assumed  to  vary  with  latitude  as  a  cubic  polynomial. 

mean:  m(u)  =  E  [Z(u)J  =  PQ  +  p  -^y  +  P2y2  +  03y3 


where  E  [ . ]  is  the  expectation  operator 
Z(v.)  is  the  ion  density  field 

PQ,  Pi>  P 2  @3  are  ^  cvbi0  polynomial  coefficients 

In  Version  1,  this  cubic  trend  is  subtracted  from  the  observations 
and  the  residual  field  thus  obtained  used  to  estimate  the  grid  values.  The 
trend  is  restored  to  the  estimates  at  the  end  of  the  BLUE  algorithm.  The 
second  version  leaves  the  trend  in  the  observations  while  imposing  a  con¬ 
straint  to  ensure  consistency  in  the  mean  of  the  final  estimated  values. 
Section  2.1  presents  the  model  assumptions  and  theory  underlying  the  BLUE 
algorithm  as  implemented  in  Version  1,  and  Section  2.2  discusses  the  corre¬ 
lation  model  fitting  techniques.  The  corresponding  topics  for  Version  2 
are  discussed  in  Section  3. 


2.1  BLUE  ASSUMING  STATIONARY  MEAN 

In  the  original  implementation  the  BLUE  algorithm  was  performed  with 
the  residual  field 


Y(ii)  =  Z(ii)  -  m(ii) 
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The  known  second  moment  characteristic  of  Y(u)  are 


mean:  E  (Y(u)  ]  =  0 


covariance:  Cov  =  Cov  (h)  =  E  [Y^)  Y(u2)] 


where 


h  =  %  "  ii2 


The  linear  estimator  of  a  residual  value  Y  (ju^)  on  the  grid 


Y  (Oq)  -  2  YCu±) 

i=l 

given  the  mean-subtracted  observation  set,  Y(u^) ,  i=l,2,...N 
Unbiasedness  requires  that 
E  [Y  CUq)  ]  =  E  [YO^)] 
which  reduces  to  the  condition: 


Ai  =  l 
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The  estimation  variance  is 


*  Var  [YO^)  -  YOg  ]  =  Var  U  X.  Yfc^)  -  Y^)  ] 

i=l 

N 

=  Cov  (0)  -  2  I  XA  Cov  (hio) 
i=l 

N  N 

+  S  2  Ai  Xj  Cov  (hi;j) 

i=l  i=l 

N 

Minimizing  the  above  subject  to  the  condition  X^  =  1  yields 

the  following  set  of  N+l  linear  equations  in  the  X^'s  and  the  Lagrange 
multiplier,  m  . 

N 

£  X j  Cov  (h^j)  +  m  =  Cov  (hio) 
j=1  i=l,2. . .N 


A 1 

a2 

• 

• 

• 

£  * 

Cov  (h1Q) 

• 

• 

• 

• 

XN 

1 

• 

°°» w 

i 

K  is  symmetric  (N+l)  x  (N+l)  matrix  and  the  solution  requires  its  inversion 
yielding 

A  =  K"1  £ 

The  BLUE  of  Z^)  can  hence  be  formed  from  the  optimal  set  of  A^s. 
Also,  the  optimal  estimation  variance  is 

2  N 

o  ■  Cov  (0)  -  1  A  .  Cov  (h-  )  -  M 
e  i=l  1  30 

Note  that  the  matrix  K  need  be  computed  and  inverted  only  once  for 
all  points  on  the  grid  to  be  estimated,  since  the  same  set  of  observations 
is  used  in  estimating  the  whole  grid.  Only  the  vector  £  varies  from  one 
grid  value  to  the  next. 


2.2  MODEL  PARAMETER  ESTIMATION:  ESTIMATING  THE  EXPONENTIAL  COVARIANCE 

The  BLUE  algorithm  currently  implemented  on  the  UNIVAC  conputer  at 
APGWC  performs  a  minimum  variance  estimation  using  a  linear  combination  of 
known  observations  to  produce  unbiased  estimates  of  the  unknown  grid  loca¬ 
tions.  This  estimation  procedure  requires  prior  knowledge  of  the  spatial 
correlation  of  the  ion  density  data.  This  version  of  BLUE  models  spatial 
correlation  in  terms  of  the  exponential  covariance  function 

Cov  (Up  ii2)  =  °2  exP  (-ah) 
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where 


%'  %  are  t*ie  coordinates  of  the  two  points  under  con¬ 
sideration 

h  ”  |  %  -  “2  | 
a  =  exponential  parameter 
2 

o  =  point  variance  of  the  ion  density  field 

Inplementation  of  the  BLUE  algorithm  then  requires  the  prior  estima- 
tion  of  a  and  the  point  variance  a  • 

2 

The  point  variance  o  can  be  easily  estimated  on-line  from  each  set 
of  6AM  and  6PM  observations  as  they  become  available.  Hie  value  a  was 
estimated  from  the  data  available  during  model  development.  This  value  was 
arrived  at  through  trial  and  error  and  a  procedure  whereby  each  observed 
value  is  treated  as  an  unknown  point  and  estimated  using  the  others.  Sta¬ 
tistics  formed  from  the  difference  between  the  estimated  and  actual  ob¬ 
served  values  are  used  as  measures  of  the  goodness  of  the  assumed  value  of 
a.  Trial  and  error  show  that  the  value  a  =  0.2  yielded  the  best  values  for 
these  statistics  and  this  value  has  been  incorporated  into  the  software  now 
operational  at  AFGW2. 

It  is,  of  course,  to  be  expected  that  each  set  of  6AM  or  6PM  observa¬ 
tions  may  yield  a  different  correlation  structure.  The  new  version  of  BLUE 
to  be  implemented  will  incorporate  procedures  to  estimate  on-line  the 
parameters  of  the  correlation  function  (the  semivariogram  in  the  new  ver¬ 
sion)  for  each  set  of  6AM  or  6PM  observations. 
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3.  MODEL  ASSUMPTIONS  AND  THEORY:  BLUE  VERSION  2 


Tto  take  care  of  variation  in  the  spatial  correlation  structure  of  the 
density  field  from  one  12-hour  observation  period  to  the  next,  on-line 
estimation  of  correlation  parameters  for  each  set  of  6AM  or  6PM  observa¬ 
tions  was  deemed  necessary.  Preliminary  analysis  with  available  data  also 
points  to  the  advantage  of  using  the  semivariogram  (defined  below)  to  mea¬ 
sure  spatial  correlation.  In  addition,  theoretical  considerations  as  well 
as  evidence  of  increased  precision  in  practice,  suggest  that  the  observa¬ 
tion  set  should  be  used  directly  (i.e.  without  subtracting  the  mean)  to 
estimate  the  unobserved  grid  values.  The  above  changes  are  being  incorpo¬ 
rated  in  a  new  version  of  the  BLUE  algorithm  (the  second  module  of  the 
software  package)  and  analysis  of  their  validity  is  in  progress.  Section 
3.1  presents  the  BLUE  algorithm  using  the  actual  observation  set  (rather 
than  the  residual  set)  while  Section  3.2  discusses  model  fitting  techniques 
for  the  semivariogram  currently  under  development. 


3.1  BLUE  ASSUMING  KNOWN  NUN-STATIONARY  MEAN 

,  o 

The  ion  density  field  Z(u)  is  assumed  to  be  a  random  process  in  ft 

with  known  second-moment  characteristics.  The  mean  expressed  as 

E  (Z(ii)  J  -  mill) 

is  known  for  all  points  in  the  area  of  interest  (i.e.,  all  points  on  the 
grid  where  values  are  to  be  estimated) ,  and  in  this  case  a  cubic  polynomial 
varying  with  latitude  only.  The  semivariogram  defined  as 

y  (h)  =1  Var  IZOi.)  -  z(u?>  J 
2 


where 


h  = 


%  -*2 
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is  also  known.  In  our  case,  the  semivariogram  is  isotropic  and  spherical 
in  structure  with  parameters  that  may  be  estimated  (see  Section  3.2) . 

Given  a  set  of  observations  ZQi^),  i=l,...N,  it  is  desired  to 
estimate  values  on  an  unobserved  regular  grid.  Let  Z^)  be  a  point  on  the 
grid. 


lhe  linear  estimator  of  Z (Uq)  is  formed  as  follows: 

A  N 

Z  (1L)  =  I  Z^) 

i=l 

Unbiasedness  is  maintained  by  requiring 
E  (Z (j^)]  =  E  tZ(Uo>] 

which  is  re-expressible  in  terms  of  the  known  means  as 
N 

1  Ai  m(u^)  =  m^) 
i=l 

The  estimation  variance  is  expressed  as 

Var  [Zdip)  -  ZOJq)} 

N 

=  Var  [  I  X.  z(Ui>  -  Z(Uo)] 
i=l 

Mather on  (1970)  has  shown  that  the  above  variance  is  finite  only  if  the 
following  condition  on  the  weights,  X  holds: 

N 

Z  Ai  =  1 
i=l 
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Given  the  above  condition ,  the  estimation  variance  can  then  be  ex¬ 
pressed  in  terms  of  semivar  iograms  as 

N 

Var  [  2  X£  Zdl*)  -  Zd^)  J 
i-1 

N  N  N 

-  -  £  2  7  (h.j)  +2  2  X  A  7  (h^) 

i*=l  j*l  i»l 

The  "best"  criterion  requires  that  this  esimation  variance  be  mini¬ 
mized  subject  to  the  two  constraints  on  the  X  A  Lagrange  multiplier 
formulation  yields  the  following  set  of  N+2  linear  equations  in  the  X  ^»s 
and  the  multipliers,  pi  Q  and  p^. 

N 

2  X  j  7  (Uj  -  Ui)  +  »  I  +  1*0^)  *  y  CUi  -  Uo) 

j=l 

i  =  1,2, ...N 
N 

2  \i  «=  1 

i=l 

N 

2  Xi  ni  (ju-i)  *  md^) 

i=l 

Or  in  natrix  formulation 


and  the  BLUE  of  ZGu^)  can  then  be  formed  from  the  optimal  set  of 
X  iSl(2a  > 

In  addition,  the  minimum  estimation  variance  is: 

2  n 

°e  “  1  xi  7  (ill  "  iio)  +  M  x  mOio) 

i=l 

The  above  algorithm  is  implemented  for  each  point  on  the  regular 
grid.  Since  the  same  observations  ZQi^),  i=l,...N  are  used  for  estimating 
every  point  on  the  grid,  it  is  apparent  that  the  matrix  G  needs  to  be  in¬ 
verted  only  once  and  only  the  R.H.S.  vector  H  varies  from  point  to  point 
on  the  grid  yielding  a  different  set  of  optimal  X^'s  for  each  grid  to  be 
estimated. 

3.2  MODEL  PARAMETER  ESTIMATION:  ESTIMATING  THE  SEMIVARIOGRAM 

The  BLUE  algorithm  requires  the  specification  of  a  model  for  the  mean 
structure  (or  trend)  of  the  ion  density  field  and  a  correlation  structure 
in  terms  of  either  a  covariance  or  semivariogram.  In  the  new  version  of 
the  software,  a  cubic  polynomial  which  is  a  function  only  of  the  latitude 
is  assumed  for  the  trend  and  the  form  of  the  spatial  correlation  is  speci¬ 
fied  in  terms  of  an  isotropic  spherical  semivariogram.  The  respective 
equations  are: 

mean:  m(u)  *  E  [Z(u)J 

m(u)  =  0O+  V+  V2*  V 

semivariogram:  7  (h)  =  ) 

1 

=  -  Var  (Z  (i^)  -  Z  (u2)  ) 

2 

i 

-  -  E  l(Y  Glj)  -  Y  (il2)  )2  j 
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where  ZQi)  is  the  ion  density  field 

YQi)  is  the  residual  field 
YQi)  =  Z(ii)  -  m  (u) 
and 

Cu.)  =  (x,  y)T  are  the  geographical  coordinates 


h  is  a  distance  measure  computed  as  if  the  points  Qi.)  lie  on  a 
cylinder  (instead  of  a  sphere) .  This  model  assumption  has  the  effect  that 
observation  points  near  the  North  Pole  which  lie  on  different  orbits  are 
pushed  apart  so  that  the  effective  correlation  across  orbits  is  smaller 
than  would  otherwise  be  the  case  if  actual  distances  on  a  sphere  are 
assumed. 


$0,  $2'  02'  P3  =  coefficients  of  the  cubic  trend 

Co,  C,  a  =  parameters  of  the  spherical  semivariogram 
a  is  known  as  the  range 
Co  the  nugget  effect 
and  Co  +  C  the  sill. 


These  parameters  are  illustrated  geometrically  in  Figure  3-1. 
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Figure  3-1:  Spherical  Semi variog ram 

The  model  assumption  of  a  cubic  polynomial  varying  with  latitude  for 
the  mean  structure  was  suggested  by  ion  density  data  currently  available 
and  confirmed  by  AFOC.  Tests  with  available  data  also  sugest  that  the 
form  of  the  spherical  semi  variog ram  adequately  models  the  residual  correla¬ 
tion.  Many  physical  phenomena  in  <f? 2  are  also  known  to  exhibit  the  spheri¬ 
cal  semivariogram  structure  (see  David  [1977]  and  Chua  [1980])  and  the 
parameters  are  amenable  to  automatic  parameter  fitting  procedures. 

Although  the  form  of  the  trend  and  semivariogram  is  assumed  not  to 
vary  from  one  12-hour  period  (6AM  or  6PM)  to  the  next,  the  parameters 
0i  (i=0,...3)  and  Co,  C,  and  a  will  be  estimated  for  every  set  of  6AM 
and  6PM  observations. 

The  coefficients  0i  (i=0, . .  .3)  will  be  estimated  using  a  simple 
least  squares  regression. 

With  the  trend  subtracted  from  the  observations,  the  residuals  are 
now  used  to  estimate  the  semivariogram  parameters.  The  procedure  imple¬ 
mented  here  is  a  modification  of  the  method  suggested  by  David  (1977) . 


# 


66 


* 


An  experimental  or  "raw"  semivariogram  is  first  confuted  from  the  set 
of  6AM  (or  6PM )  observations  under  consideration.  Since  the  observations 
are  irregularly  distributed  with  respect  to  the  distance  measure  h,  the 
estimator  of  y  (h)  is  the  following: 


n 


y  (h)  =  -  Ih  {YQli  +  h)  -  Y  (Uj)  l 

•  ^  ' 


2nv 


i=l 


where  pairs  of  observations  are  grouped  by  discrete  distance  intervals,  say 
n^  pairs  between  h  and  h+  A  h  and  the  value  h  in  the  above  equation  is 
an  average  value 


1 

h  =  - 
% 


£h  hi 


i=l 


The  h  value  to  be  used  in  the  current  implementation  is  Ah  =5.0 
which  has  shown  to  provide  sufficient  discretization  to  define  the  raw 
semivariogram  structure. 


To  fit  the  spherical  semivariogram,  the  sill  is  first  established 
from  an  estimate  of  the  variance  of  the  field 


Co  +  C  =  v 
Z 

A  weighted  least  squares  regression  is  then  performed  to  fit  the 


rising  part  of  the  semivariogram.  n  points  on  the  experimental  semi- 


,  th 


variogram  are  selected  for  this  regression  where  the  (nQ  -  3)  point  is 


the  first  raw  semivariogram  value  to  exceed  the  sill  value,  Co  +  C.  The 
regression  will  yield  the  value  of  the  nugget  effect  Co  and  the  inter¬ 
section  of  the  regression  line  and  the  sill  Co  +  C  defines  the  point 


The  estimated  parameters  are  the  input  to  the  BLUE  algorithm  which 
estimates  a  regular  grid  of  density  values  based  on  the  estimated  parame¬ 
ters  and  assumed  model  for  trend  and  semivariogram. 

3.3  OPTIONAL  MODEL  VALIDITY  CHECK 

Note  that  software  is  also  available  to  test  the  validity  and 
accuracy  of  the  assumed  models  and  the  estimated  parameters.  Since  the 
modelling  assumptions  of  the  form  of  the  trend  and  semivariogram  were  made 
on  the  basis  of  the  data  currently  available  to  the  analysts,  the  uncer¬ 
tainty  obviously  exists  whether  the  same  assumptions  will  apply  to  the 
on-line  data  to  be  collected.  To  test  these  assumptions ,  a  program  is 
available  which  takes  each  set  of  6AM  or  6PM  observations  and  the  model 
parameters  estimated  from  them  and,  by  a  process  of  estimating  each  obser¬ 
vation  using  the  others,  establishes  a  set  of  statistics  that  measure  the 
validity  of  the  model  and  model  parameters  used.  This  program  can  be 
executed  if  a  poster iti  checks  on  model  assumptions  and  re-adjustment  of 
parameters  are  deemed  necessary. 
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ICM  ENERGY  RTJTTOfJJ  AND  PHOTON  CROSS  SECTIONS 
£Q&  Q,  M 2#  Q2  AMD  REIATED  DMA 


I 


A  set  of  low  energy  (0-300  eV)  cross  sections  of  functions  of  energy 
and  related  date  were  compiled  for  use  in  calculating  the  photoelectron 
flux  in  the  daytime  ionosphere  of  the  earth. 

The  upper  atmosphere  is  composed  of  atoms,  sinple  molecules,  and 
their  ions.  When  the  sun  is  relatively  quiet,  the  atmosphere  can  be 
described  and  modeled  as  being  in  the  steady  state.  Sairple  data  on  the 
neutral  particle  conposition  of  the  upper  atmosphere  and  ionizing  electro¬ 
magnetic  radiation  from  the  sun  were  included  (e.g.,  values  for  neutral, 
ion,  and  electron  densities  and  the  associated  tenperatures  found  in  the 
mid-latitude  daytime  ionosphere ,  and  tables  of  the  energy  levels  of  the 
main  species  in  the  upper  atmosphere) . 

The  photo  ionization  and  photodissociation  of  the  atmosphere  species 
are  the  means  of  putting  additional  energy  into  the  upper  atmosphere.  This 
is  done  by  the  extreme  ultraviolet  (EUV)  region  of  the  spectrum.  The  EUV 
spectrum  is  composed  primarily  of  lines  and,  when  examined  in  great  detail, 
is  highly  structured.  It  must  be  measured  above  the  atmosphere  of  the 
earth.  Two  sairple  measured  solar  spectra  were  included. 

The  ionospheric  density  is  low  enough  for  sinple  photoabsorption  to 
result  in  re-emission  and  the  total  process  resembles  elastic  scattering. 
The  main  interest  is  in  the  energetic  electrons  produced  through  photo¬ 
ionization.  In  the  energy  range  in  which  photoionization  is  possible,  ab¬ 
sorption  must  also  be  considered.  Photoionization  and  the  photoabsorption 
cross  sections  for  0,  Nj,  and  C>2  were  chosen  for  wavelengths  from  34  to 
103QA. 
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The  recombination  rates  of  atmospheric  species  are  measured  directly 
in  laboratory  experiments  or  determined  from  measured  cross  sections.  The 
analysis  of  atmospheric  data  provides  a  check  on  the  rates  and  assumptions 
involved.  Hie  cross  sections  for  dissociative  recombination  are  well 
behaved  near  .1  eV.  Using  this  as  the  main  part  of  the  cross  sections 
allowed  the  rates  to  be  calculated. 

The  experimental  total  cross  sections  for  ionization  of  0,  N2,  and  02 
were  reviewed.  One  difficulty  in  the  use  of  these  is  that  the  energy  of 
the  other  electron  is  undetermined  since  the  ionic  state  and  its  excitation 
energy  are  unknown. 

The  electron-neutral  particle  cross  sections  must  be  known  in  order 
to  calculate  the  electron  distribution  function  for  the  Earth's  ionosphere. 
The  variety  of  states  and  cascading  processes  make  experimental  identifica¬ 
tion  of  the  many  cross  sections  difficult,  especially  for  the  higher  lying 
states.  As  the  energy  increases,  many  states  may  be  surnned  over:  rotation¬ 
al  states  for  vibrational  excitation;  vibrational  (and  rotational)  states 
for  electronic  excitation;  rotational  (and  possibly  vibrational)  virtual 
states  for  resonances.  At  energies  high  enough  to  allow  the  excitation  of 
several  different  levels,  the  total  cross  section  must  be  broken  down  into 
the  sum  of  the  separate  cross  sections.  This  is  a  difficult  task  and  may 
cause  ambiguities  in  the  final  result. 

At  the  very  low  energies,  measurements  of  the  cross  sections  are 
difficult  and  much  data  comes  from  calculations.  Due  to  the  complexity  of 
multi-electron  molecules,  approximations  in  the  calculations  must  be  used. 

The  differential  cross  sections  have  been  measured  for  a  number  of 
the  scattering  interactions  at  specific  energies.  Since  few  different  ener¬ 
gy  values  were  taken,  in  general,  the  report  did  not  consider  the  angular 
dependence  of  cross  sections  except  the  implied  dependence  in  the  momentum 
transfer  cross  sections. 


Hie  momentum  transfer  (or  diffusion)  cross  section  is  one  of  the 
series  of  integral  cross  sections  which  are  weighed  by  a  function  of  the 
scattering  angle  arising  in  an  angular  decomposition  of  the  Boltzmann 
collision  integral.  It  can  be  obtained  by  a  weighted  integral  of  the 
differential  elastic  cross  section  obtained  from  experiment  or  theory. 
Some  experiments  obtain  the  momentum  transfer  cross  section  directly,  e.g. 
swarm  experiments. 

Hie  cross  sections  for  transfers  between  very  low-lying  states,  such 
as  rotational  states,  are  hard  to  determine  either  experimentally  or 
theoretically. 

Hie  ground  state  of  oxygen  has  a  fine  structure  of  three  levels  being 
at  energies  of  0,  .019,  and  .029  eV  or  of  0,  156,  and  235°  K.  No  direct 
experimental  data  is  available.  Comparison  of  calculations  showed  agree¬ 
ment. 

Hie  separation  of  successive  vibrational  energy  levels  is  usually  in 
the  vicinity  of  .2  eV  or  2000°  K.  In  the  ionosphere  this  is  high  enough  so 
that  the  excited  levels  are  relatively  depopulated  but  low  enough  so  that 
excitation  by  energetic  electrons  provides  a  sink  for  the  electron  energy, 
especially  in  the  case  of  Nj.  Hiere  have  been  many  calculations  of  the 
separate  vibrational  excitation  cross  sections.  Hie  calculation  chosen 
agrees  well  with  the  experimental  general  structure  of  the  separate  cross 
sections  and  is  about  10%  larger  than  the  measured  absolute  total  cross 
section. 

Hie  electronic  states  of  the  atmospheric  species  start  at  an  energy 
of  a  few  electron  volts.  These  states  are  not  thermally  excited.  Hiey  do 
form  a  sink  for  the  energy  of  the  photoelectrons  through  excitation  by  the 
electrons  and  subsequent  radiative  decay.  It  is  at  these  energies  that  the 
separation  of  the  effects  from  different  states  becomes  difficult.  As  a 
result,  the  determination  of  a  particular  cross  section  at  a  particular 
electron  energy  depends  on  knowing  the  others  or  on  having  a  method  of 
separating  them.  Comparing  sets  of  cross  sections  with  other  data  which 
can  be  connected  with  them  (e.g.  electron  transport  coefficients)  allows 
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determination  of  a  set  which  is  self-consistent  in  a  given  context.  If  the 
cross  sections  are  to  be  used  to  model  a  similar  situation,  such  a  set 
should  give  good  results  if  it  is  sufficiently  conplete.  Such  sets  were 
determined  using  electron  transport  coefficients  for  C>2  and  N2.  These  sets 
served  as  a  guide  in  making  a  choice  between  various  experimental  cross 
sections . 

Atcsnic  oxygen  is  difficult  to  handle  experimentally  because  it  is  so 
reactive.  The  theoretical  approach  is  also  difficult  due  to  the  need  to 
properly  represent  the  open  shell  configuration  of  the  atom,  its  inter¬ 
actions  with  the  incoming  electron  including  exchange  and  polarization,  and 
the  effects  of  low-lying  states  of  the  atom  and  its  negative  ion,  i.e., 
excitation  and  resonances. 

The  cross  sections  considered  above  described  the  collision  of  an 
electron  and  an  atom  or  molecule  in  which  the  electron  loses  some  of  its 
energy.  The  opposite  process,  in  which  the  electron  gains  energy  through 
excitation  of  the  atom  or  molecule,  was  related  to  the  first. 

Molecular  dissociation  by  electron  impact  can  be  considered  as  a 
single  complex  process.  The  total  cross  section  does  not  determine  the 
electron  energy  in  the  three  particle  final  states  and  more  structure  must 
be  included  either  experimentally  or  theoretically.  It  is  natural  to 
divide  the  process  so  that  the  significant  parameters  are  excitation  cross 
sections  for  the  different  molecular  states  and  branching  ratios.  Below 
the  lowest  dissociation  energy  the  states  cannot  dissociate;  above  it, 
dissociation  may  be  probable  or  not  depending  on  the  relative  accessibility 
of  dissociated  and  nondissociated  states.  Rydberg  states,  which  could  not 
ionize,  were  considered  to  dissociate,  as  were  all  the  02  states  for  which 
dissociation  was  energetically  possible. 
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